Archimedes Optimization Algorithm with Deep Learning-Based Prostate Cancer Classification on Magnetic Resonance Imaging

Prostate cancer (PCa) is becoming one of the most frequently occurring cancers among men and causes an even greater number of deaths. Due to the complexity of tumor masses, radiologists find it difficult to identify PCa accurately. Over the years, several PCa-detecting methods have been formulated, but these methods cannot identify cancer efficiently. Artificial Intelligence (AI) has both information technologies that simulate natural or biological phenomena and human intelligence in addressing issues. AI technologies have been broadly implemented in the healthcare domain, including 3D printing, disease diagnosis, health monitoring, hospital scheduling, clinical decision support, classification and prediction, and medical data analysis. These applications significantly boost the cost-effectiveness and accuracy of healthcare services. This article introduces an Archimedes Optimization Algorithm with Deep Learning-based Prostate Cancer Classification (AOADLB-P2C) model on MRI images. The presented AOADLB-P2C model examines MRI images for the identification of PCa. To accomplish this, the AOADLB-P2C model performs pre-processing in two stages: adaptive median filtering (AMF)-based noise removal and contrast enhancement. Additionally, the presented AOADLB-P2C model extracts features via a densely connected network (DenseNet-161) model with a root-mean-square propagation (RMSProp) optimizer. Finally, the presented AOADLB-P2C model classifies PCa using the AOA with a least-squares support vector machine (LS-SVM) method. The simulation values of the presented AOADLB-P2C model are tested using a benchmark MRI dataset. The comparative experimental results demonstrate the improvements of the AOADLB-P2C model over other recent approaches.


Introduction
Prostate cancer (PCa) is a major factor for the increasing death rates in cancer and is most commonly identified among men [1]. In spite of its prevalence, PCa can be frequently non-aggressive, making it difficult to find. This results in the disease presenting higher risks to patients as to warrant treatments, such as radiation therapy or prostatectomies (prostate surgery) [2]. Histopathologically, Gleason grading can be a robust prognostic predictor

•
An intelligent AOADLB-P2C technique is presented, and it comprises AMF-based pre-processing, DenseNet-161-based feature extraction, RMSProp optimizer, LS-SVM classification, and AOA-based hyperparameter tuning. To the best of the researchers' knowledge, a AOADLB-P2C model has never been presented in the literature. • A RMSProp optimizer is applied in this study for the selection of hyperparameters involved in the DenseNet-161 model.

•
The parameter optimization of the LS-SVM model using the AOA algorithm and cross-validation helps in boosting the predictive outcome of the proposed model for unseen data.

•
The performance of the proposed model is validated using a PCa dataset. The rest of the paper is organized as follows. Section 2 provides a detailed review of the literature and Section 3 introduces the proposed model. Next, Section 4 offers the comprehensive analytical results and Section 5 concludes the work.

Related Works
The current section provides a detailed survey of existing models related to PCa classification. Liu et al. [14] modeled a DL method by combining the Inception-v3 and S-Mask region-based Convolutional Neural Network (R-CNN) models and used it in the ultrasound image-aided prognosis of PCa. The enhanced S-Mask R-CNN was leveraged to achieve a precise segmentation of the generated candidate regions and prostate ultrasound images. Furthermore, the RoI align method was also leveraged to realize the pixel-level feature point position. The respective binary mask of the prostate images was produced by the convolutional networks to distinguish the background and the prostate regions. In [15], a new automated classifier technique was modeled by merging several DL techniques to identify PCa from MRI and ultrasound (US) images. The devised technique also explains why a particular decision is given for the input MRI or US image. Numerous pretrained DL methods with custom-developed layers were included in this study on top of the corresponding pre-trained techniques and were implemented in the database.
Toledo-Cortés et al. [16] modeled a quantum-inspired deep probabilistic learning ordinal regression approach for medical image diagnosis. The proposed model leverages the advantages of intrinsic ordinal information of disease phases and representational power of DL techniques. The technique was assessed using two distinct medical image analyzing tasks, including the diabetic retinopathy grade estimation on eye fundus images and PCa diagnosis. Alam et al. [17] devised and authenticated several classifier methods on a supervised ML algorithm to forecast the occurrence of PCa. A modified LR method was modeled and applied on the images captured from patients who are vulnerable to PCa. This devised classifier method used various stages of tumor and clinical features. The clinical features included smoking history, BMI, cystitis infections, and age. Zhong et al. [18] conducted research work in which the main objective was to devise a Deep Transfer Learning (DTL)-oriented method for differentiating indolent lesions from clinically significant (CS) PCa lesions. The researchers compared the DTL-related method with a DL technique without TL and PIRADS v2 score over three Tesla multi-parametric MRI (3T mp-MRI) with Whole Mount Histopathology (WMHP) validations.
Wang and Wang [19] formulated a DL-oriented technique for automatic classification of clinically significant (CS) and clinically insignificant (CiS) PCa based on multiparametric MRI (mpMRI) images. Their study also intended to select suitable mpMRI series for PCa categorization in various anatomic zones. For the selection of optimal integration of the series for PCa classification in a particular zone and PCa classification, the researchers devised Multi-Input Selection Networks (MISNs). A MISN is a multi-input or multi-output classifier network that has a total of nine branches for processing nine input images from the mpMRI data. Poojitha and Sharma [20] discussed the saliency maps of images utilizing Deep Convolutional Generative Adversarial Networks (DCGANs) by implementing a semantic segmentation method with salient maps, offered by pathology specialists. This structure was modeled by integrated the fine-tuned VGGnet, CNNs, and RNNs. The authors presented a new method in which LSTM-RNN was used for sequential sub-band images of shearlet coefficients.
According to [21], novel urinary and serum biomarkers have been established in recent years. However, the researchers continuously search for novel biomarkers under different conditions and patient settings. In spite of these, there is a lack of particular rules with high level of evidence on the utilization of these markers. The count of the analysis, which focuses on the characterization of a particular PCa metabolic phenotype by utilizing distinct experimental methods, has been reported. In a study conducted earlier [22], a dual 5α-reductase inhibitor (5-ARI), i.e., dutasteride, was found, and it blocks testosterone from being converted into its active element, i.e., dihydrotestosterone (DHT), and decreases prostate volume, thus enhancing urinary flow rate. The Bipolar Transurethral Resection of the Prostate (B-TURP) process tends to enhance the outcomes and efficiency in comparison with the typical TURP, yet the incidence of side-effects in the B-TURP is lower.
Ferro et al. [23] reviewed existing evidence on the approaches and examinations conducted in existing literature on radiomics in PCa patients. The review also recommends its potential in terms of personalized medicine and future applications. The analysis conducted by Massanova et al. [24] related several means of PV estimate, comprising Digital Rectal Examination (DRE), Transrectal Ultrasound (TRUS), MRI, and radical prostatectomy specimens to determine the best method for volume measurement.
Several CAD models have been proposed in the literature for PCa classification process. Although several ML and DL models are available in the literature for liver cancer classification, there is a need to enhance the classification performance. Owing to the continuous exploration of this research domain, the number of parameters in DL models also increases quickly, which in turn results in model overfitting. At the same time, different hyperparameters exert a significant impact on the efficiency of CNN models. Particularly, hyperparameters, such as epoch count, batch size, and learning rate selection, are essential to attain an effectual outcome. Since the 'trial and error' method for hyperparameter tuning is a tedious and erroneous process, metaheuristic algorithms can be applied. Therefore, the current study employs the RMSProp optimizer for DenseNet and AOA algorithms in a LS-SVM model.

The Proposed Model
In this article, the authors propose a novel AOADLB-P2C method for PCa diagnosis from MRI images. Initially, the proposed AOADLB-P2C model pre-processes the MRI images in two stages, i.e., AMF-based noise removal and contrast enhancement. Moreover, the RMSProp optimizer with the DenseNet-161 model is applied for the purpose of feature extraction. Finally, the presented AOADLB-P2C model classifies the images for PCa using the AOA with LS-SVM models. Figure 1 shows the overall working process of the proposed AOADLB-P2C system.

Pre-Processing
At first, the proposed AOADLB-P2C model pre-processes the MRI images in two stages using AMF-based noise removal and contrast enhancement [25]. A simple MF is employed at the median of the window to replace the centre pixel that is regarded as a window. Once the central pixel is categorized as (salt) or (pepper), it is replaced by the middle value of the window. The major drawback of the standard MF is that, although the pixels are regarded as incorrect (except 0 or 255), it gets substituted with the median of the window. This reduces the whole visual quality of the image. In addition, it is not possible for an MF to preserve the edges. In general, the window is sorted in an ascending order. The median is deemed to be the middle value. Therefore, the undamaged pixels get substituted with the median value. The following stage is to employ the Contrast Limited Adaptive Histogram Equalization (CLAHE) approach on the noise-removed images. It has the best tractability in electing the local histogram map ping function. The CLAHE approach separates the images into suitable regions and enforces the histogram equalization approach upon them. Then, the clipped pixels are reassigned to every gray level. This novel histogram is unlike the typical histogram since the intensity of all the pixels gets constrained by the user-selectable maximum number. Hence, the CLAHE model might limit the noise enhancement outcome.

Pre-Processing
At first, the proposed AOADLB-P2C model pre-processes the MRI images in two stages using AMF-based noise removal and contrast enhancement [25]. A simple MF is employed at the median of the window to replace the centre pixel that is regarded as a window. Once the central pixel is categorized as (salt) or (pepper), it is replaced by the middle value of the window. The major drawback of the standard MF is that, although the pixels are regarded as incorrect (except 0 or 255), it gets substituted with the median of the window. This reduces the whole visual quality of the image. In addition, it is not possible for an MF to preserve the edges. In general, the window is sorted in an ascending order. The median is deemed to be the middle value. Therefore, the undamaged pixels get substituted with the median value. The following stage is to employ the Contrast Limited Adaptive Histogram Equalization (CLAHE) approach on the noise-removed images. It has the best tractability in electing the local histogram map ping function. The CLAHE approach separates the images into suitable regions and enforces the histogram equalization approach upon them. Then, the clipped pixels are reassigned to every gray level. This novel histogram is unlike the typical histogram since the intensity of all the pixels gets constrained by the user-selectable maximum number. Hence, the CLAHE model might

Feature Extraction
At this stage, the RMSProp optimizer is applied with the DenseNet-161 model for the purpose of feature extraction [26]. This model possesses dense convolution network (DenseNet) that accomplishes the maximum classification performance on large data, such as CIFAR-10 and ImageNet. The ResNet model acquires a duplicate feature map and contains several parameters, which make the training process a difficult one. On the other hand, the DenseNet model is comprised of thin layers and learns from a small number of feature maps, whereby every layer feeds data to the adjacent layer. The connecting model has more than one convolutional layer in a dense block of the DenseNet model. This feed-forward connection raises the overall number of layers from L to L(L + 1)/2. Consequently, the network gets trained in an effective manner, which not only reduces the over-fitting issue but also enhances the performance of the model. The DenseNet-161 model correspondingly comprises four dense blocks with (6, 12, 36, and 24) sub-blocks. Every sub-block has two convolutional layers that result in a total count of 156 layers with 5 convolutional layers having a growth rate of k = 48. Here, k represents the feature map.
In model compression, the transition layer plays a crucial role. If the number of channels in the output feature maps of the dense block is m, then the subsequent transition layer creates [θm] output feature map. Here, 0 < θ ≤ 1 represents the compression factor. When θ = 1, the number of feature mappings through the transition layer remains unchanged. This model is appropriate for embedded devices due to its thick compressibility and connectivity. With the help of the DenseNet-161 model as the pre-trained module, the knowledge (weight value) of the basic structure learnt in the first and middle layers gets transmitted to the presented method. The fundamental parameter, i.e., the pre-trained model, learns to categorize different objects in the ImageNet, and it has been utilized for the classification of the images of humans who carry baggage. Thus, the transfer learning fastens the training procedure and improves the novel CNN model.
To adjust the hyperparameters related to the DenseNet method, the RMSProp optimizer is exploited in this study [27]. The RMSprop is an optimization algorithm for neural networks that uses the magnitude of recent gradients to normalize the rest of the gradients. It is similar to the standard Stochastic Gradient Descent (SGD) algorithm, but it uses a moving average of the squared gradient values to scale the learning rate. This function helps in preventing oscillations and divergence in the optimization process. The RMSprop is often used in Deep Learning techniques and is known to work well with Recurrent Neural Networks (RNNs). The RMSprop is an enhanced model of Adagrad, and its upgrading process is similar to that of the Adagrad optimizer. It calculates the exponential decay average of the squared gradient as given below: In Equation (1), β denotes the decaying rate that is usually set as 0.9. In addition, the upgraded values of the parameter from the RMSprop are same as the Adagrad: Furthermore, the simplified model of the Adagrad is given below: The upgraded value of the RMSprop is defined by Equation (4) as given below: Hence, the RMSprop is an enhanced method based on the gradient. To conduct the analysis, the rate of learning optimized approach is employed to enhance the training efficacy.

Prostate Cancer Classification
For PCa detection, the LS-SVM model is used. The LS-SVM is a well-organized ML algorithm that functions on the basis of statistical learning approach developed by Vapnik. The LSSVM algorithm effectively overcomes the higher-dimensional nonlinear and local minimal problems [28]. It can be established in a SVM method with two other features. Initially, inequality is constrained by the equality constraints after which it transmits two programming problems to the linear expression. These additional features accelerate the computational time of the LSSVM on the SVM method. Figure 2 demonstrates the structure of the SVM hyperplane.
algorithm that functions on the basis of statistical learning approach developed by Vapnik. The LSSVM algorithm effectively overcomes the higher-dimensional nonlinear and local minimal problems [28]. It can be established in a SVM method with two other features. Initially, inequality is constrained by the equality constraints after which it transmits two programming problems to the linear expression. These additional features accelerate the computational time of the LSSVM on the SVM method. Figure 2 demonstrates the structure of the SVM hyperplane. The learning process of the LSSVM is shown below. ( , ), = 1,2, … , , where represents the predictor parameter and denotes the outcome parameter. The linear regression function is given below: In Equation (5), and correspondingly determine the weight vector and the deviation.
In Equation (6), and correspondingly indicate the error penalty and the slack parameter. Based on Mercer's condition, the linear kernel purpose is selected as shown below: The learning process of the LSSVM is shown below. (x i , y i ), i = 1, 2, . . . , n, where x i represents the i th predictor parameter and y i denotes the i th outcome parameter. The linear regression function is given below: In Equation (5), w and b correspondingly determine the weight vector and the deviation.
In Equation (6), C and ζ i correspondingly indicate the error penalty and the slack parameter. Based on Mercer's condition, the linear kernel purpose is selected as shown below: The final LSSVM approach is calculated based on the following expression: where a i indicates the Lagrangian multiplier.
To adjust the LS-SVM parameters, the current research work utilizes the AOA. Hashim et al. [29] proposed the AOA algorithm based on the Archimedes' principle that is deemed to be the law of physics. This principle focuses on an object that is immersed, either fully or partly in a fluid. In general, there exists an upward force (termed 'buoyancy') exerted by the liquid on the body. This force corresponds to the weight of the fluid displaced by the body. In this work, immersed objects are deemed to be an individual population (i.e., solution candidates). This technique begins with the population initialization and, along with that, the position of each object is arbitrarily initiated inside the problem searching spaces. Then, their respective Fitness Function (FF) is evaluated. In the iteration model, the AOA upgrades the object density (den i ) and volume (vol i ) of the i th object once the acceleration is upgraded based on the collision with a neighboring object. The position of all the objects (O i ) is defined as follows: where l i and u i characterize the lower and upper limits of the i th object, and N represents the object amount.
Consider that rand represents a random number in the range of {0, 1}, the acceleration (acc i ) of the i th objective is initialized as follows: The initial FF is assessed, and the object with the optimal fitness is allotted as x best , den best , vol best , and acc best .
The updating methods of the i th object density and volume for the iteration t + 1 are formulated as follows: In these expressions, t denotes the current iteration and rand denotes a random number. At first, there exists a collision amongst the objects that try to accomplish the equilibrium state. This process is presented in the AOA through the Transfer Operator (TF) that assists in the transformation from the exploration stage to the exploitation stage: In this equation, t represents the iteration number and t max denotes the maximal iteration count. Here, the TF values get increased gradually. The density reducing factor (d) helps the AOA to transfer from the global to the local search space: The values of d t+1 reduce with time while, on the other hand, the proper allocation of these parameters helps in attaining a fine balance between the exploitation and exploration stages [30]. The exploration stage is signified as the collision among the objects, while the TF value is 0.5. The acceleration of the i th object at t + 1 iterations is upgraded as follows: Here, acc mr den mr , and vol mr , indicate the acceleration, density, and volume of the random material (mr). The exploitation phase in the AOA assumes that there is no collision among the objects. If TF > 0.5, there is no collision between the objects while an object's acceleration is updated for the iteration t + 1 using Equation (17): Healthcare 2023, 11, 590 9 of 17 In Equation (17), acc best represents the acceleration of an object that is optimal with respect to fitness function. It is crucial to normalize the acceleration of each particle as it defines the step percentage that every particle might change and is given in the following expression: In this expression, l and u characterize the normalization range while the values that are correspondingly allotted are 0.1 and 0.9. When an object is further away from the global optima, then the value of the acceleration would be higher. The exploration phase is conducted or else the exploitation phase is implemented.
The position of the i th particle gets upgraded in the exploration phase as follows: The location updating in the exploitation phase is formulated as follows: C 1 is equal to 2, whereas C 2 is equal to 6. T increases with time and is directly proportional to the Transfer Operator. It is defined using the equation, T = C 3 × TF. C 3 shows a constant number, x best indicates the place of the optimum particle, and F represents the flag that is utilized to change the direction of particle movements. It can be determined using the following expression: where P = 2 × rand − C 4 . Finally, the FF values are evaluated at the upgraded location. The AOA methodology processes a FF to realize the enhanced classification results. It describes a positive integer to achieve a good performance of the candidate results. In this case, a minimized classifier error rate is assumed to be the FF as expressed in Equation (22)

Results and Discussion
The proposed model was simulated using the Python 3.6.5 tool on a PC configured with the following specifications: i5-8600k, GeForce 1050Ti 4GB, 16GB RAM, 250GB SSD, and 1TB HDD. The parameter settings are learning rate: 0.01, dropout: 0.5, batch size: 5, epoch count: 50, and activation: ReLU. The current section discusses the PCa classification results achieved by the proposed AOADLB-P2C model. The model was tested on a dataset comprising 400 samples under two classes, as defined in Table 1. Figure 3 depicts some of the sample images used in this study.    The confusion matrices generated by the proposed AOADLB-P2C model during different runs of execution are displayed in Figure 4. The figure highlights that the proposed AOADLB-P2C method recognized the images under the two classes accurately. Table 2 and Figure 5 show the overall classification outcomes achieved by the proposed AOADLB-P2C model under five runs. The outcomes imply that the presented AOADLB-P2C model achieves enhanced outcomes under each run. For example, on run-1, the AOADLB-P2C model acquires an average accu bal of 93%, sens y of 93%, spec y of 93%, F score of 93%, and a MCC of 86.07%. Meanwhile, on run-3, the proposed AOADLB-P2C method obtains an average accu bal of 95%, sens y of 95%, spec y of 95%, F score of 95%, and a MCC of 90%. Eventually, on run-5, the proposed AOADLB-P2C approach acquires an average accu bal of 99.50%, sens y of 99.50%, spec y of 99.50%, F score of 99.50%, and a MCC of 99%.    Figure 5 show the overall classification outcomes achieved by the proposed AOADLB-P2C model under five runs. The outcomes imply that the presented AOADLB-P2C model achieves enhanced outcomes under each run. For example, on run-1, the AOADLB-P2C model acquires an average of 93%, of 93%, of 93%, of 93%, and a MCC of 86.07%. Meanwhile, on run-3, the proposed AOADLB-P2C method obtains an average of 95%, of 95%, of 95%, of 95%, and a MCC of 90%. Eventually, on run-5, the proposed AOADLB-P2C approach acquires an average of 99.50%, of 99.50%, of 99.50%, of 99.50%, and a MCC of 99%.   The TACC and VACC values achieved by the proposed AOADLB-P2C technique on PCa performance are shown in Figure 6. The figure infers that the proposed AOADLB-P2C approach exhibits an improved performance with increased TACC and VACC values. It is also noted that the AOADLB-P2C algorithm reaches the maximum TACC outcomes.  The TACC and VACC values achieved by the proposed AOADLB-P2C technique on PCa performance are shown in Figure 6. The figure infers that the proposed AOADLB-P2C approach exhibits an improved performance with increased TACC and VACC values. It is also noted that the AOADLB-P2C algorithm reaches the maximum TACC outcomes. Healthcare 2023, 11, x 13 of 17     In Table 3, the outcomes of the comprehensive comparative study achieved by the proposed AOADLB-P2C method and other recent models are given [10]. Figure 8 reports the results from the comparative accu y and F score inspection, attained by the proposed AOADLB-P2C method and other techniques. The results specify that the AOADLB-P2C method accomplish superior results over other models. In terms of accu y , the AOADLB-P2C model obtains a maximum accu y of 99.50% while the other models, including the NB, DT, SVM-Gaussian, SVM-RBF, and GoogleNet, obtain lower accu y values of 98.46%, 97.29%, 98.36%, 98.12%, and 98.28% respectively. In addition, based on the F score , the AOADLB-P2C technique achieves the highest F score of 99.50%, while the NB, DT, SVM-Gaussian, SVM-RBF, and GoogleNet approaches achieve lower F score values of 98.81%, 98.83%, 97.91%, 98.52%, and 98.69%, respectively. In Table 3, the outcomes of the comprehensive comparative study achieved by the proposed AOADLB-P2C method and other recent models are given [10]. Figure 8 Figure 9 illustrate the sens y and spec y analytical outcomes accomplished by the proposed AOADLB-P2C methodology and other existing approaches. The outcomes show that the AOADLB-P2C approach exhibits a superior performance over other techniques. In terms of sens y , the proposed AOADLB-P2C methodology attains a maximum sens y of 99.50%, while the NB, DT, SVM-Gaussian, SVM-RBF, and GoogleNet methods attain lower sens y values of 98.47%, 97.26%, 98.43%, 98.63%, and 98.28%, respectively. Likewise, in terms of spec y , the AOADLB-P2C technique attains a maximum spec y of 99.50%, whereas the NB, DT, SVM-Gaussian, SVM-RBF, and GoogleNet approaches attain lower spec y values of 98.641%, 98.47%, 98.54%, 97.89%, and 98.49%, respectively.  These results confirm the improved outcomes of the proposed AOADLB-P2C model on prostate cancer classification process. The enhanced performance of the proposed model is due to the inclusion of AMF-based pre-processing, DenseNet-161-based feature extraction, RMSProp optimizer, LS-SVM classification, and AOA-based hyperparameter tuning. Therefore, the proposed model can be employed for accurate PCa detection and classification using MRI images.

Conclusions
In this research work, the authors introduced a new AOADLB-P2C model for PCa diagnosis using MRI images. Initially, the AOADLB-P2C model pre-processes MRI images in two stages using AMF-based noise removal and contrast enhancement. Moreover, the RMSProp optimizer is applied with the DenseNet-161 model for the purpose of feature extraction. Finally, the presented AOADLB-P2C model classifies PCa using the AOA with the LS-SVM method. The presented AOADLB-P2C method was experimentally evaluated using a benchmark MRI dataset. The comparative simulation values confirmed the improved performance of the proposed AOADLB-P2C methodology over other recent methodologies, with a maximum accuracy of 99.50%. Hence, the presented AOADLB-P2C These results confirm the improved outcomes of the proposed AOADLB-P2C model on prostate cancer classification process. The enhanced performance of the proposed model is due to the inclusion of AMF-based pre-processing, DenseNet-161-based feature extraction, RMSProp optimizer, LS-SVM classification, and AOA-based hyperparameter tuning. Therefore, the proposed model can be employed for accurate PCa detection and classification using MRI images.

Conclusions
In this research work, the authors introduced a new AOADLB-P2C model for PCa diagnosis using MRI images. Initially, the AOADLB-P2C model pre-processes MRI images in two stages using AMF-based noise removal and contrast enhancement. Moreover, the RMSProp optimizer is applied with the DenseNet-161 model for the purpose of feature extraction. Finally, the presented AOADLB-P2C model classifies PCa using the AOA with the LS-SVM method. The presented AOADLB-P2C method was experimentally evaluated using a benchmark MRI dataset. The comparative simulation values confirmed the improved performance of the proposed AOADLB-P2C methodology over other recent methodologies, with a maximum accuracy of 99.50%. Hence, the presented AOADLB-P2C method can be employed for PCa classification. In the future, the efficiency of the AOADLB-P2C model can be improved using ensemble fusion models. Data Availability Statement: Data sharing is not applicable to this article as no datasets were generated during the current study.